{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## \u26a0\ufe0f **DEPRECATED**\n",
        "\n",
        "This notebook is deprecated and may no longer be maintained.\n",
        "Please use it with caution or refer to updated resources.\n"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "a7c3286e",
      "metadata": {},
      "source": [
        "# Accelerate VGG19 Inference on 4th Gen Intel\u00ae Xeon\u00ae Scalable Processors (Sapphire Rapids)"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "f6f290e4",
      "metadata": {},
      "source": [
        "## Introduction\n",
        "\n",
        "\n",
        "This example walks through a complete pipeline:\n",
        "\n",
        "1. Train a [VGG19](https://arxiv.org/abs/1409.1556) image classification model by transfer learning from a pre-trained [TensorFlow Hub](https://tfhub.dev) model.\n",
        "\n",
        "2. Quantize the FP32 Keras model into an INT8 PB model using Intel\u00ae Neural Compressor.\n",
        "\n",
        "3. Test and compare the performance of the FP32 and INT8 models.\n",
        "\n",
        "This example runs on Intel\u00ae CPUs that support Intel\u00ae AVX-512 Vector Neural Network Instructions (VNNI) or Intel\u00ae Advanced Matrix Extensions (AMX). CPUs with AMX see a larger performance improvement."
      ]
    },
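    {
      "cell_type": "code",
      "execution_count": null,
      "id": "15e562de-b710-497f-b416-a10a61b5d9b8",
      "metadata": {},
      "outputs": [],
      "source": [
        "# Enable oneDNN optimizations in case they are not already enabled.\n",
        "# The comment sits on its own line because %env treats the rest of its line as the value.\n",
        "%env TF_ENABLE_ONEDNN_OPTS=1"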
      ]
    },
    {
      "cell_type": "markdown",
      "id": "3d0c7db5",
      "metadata": {},
      "source": [
        "## Import all the required libraries"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "id": "8696c512",
      "metadata": {},
      "outputs": [],
      "source": [
        "%matplotlib inline\n",
        "\n",
        "import os\n",
        "import warnings\n",
        "warnings.filterwarnings('ignore')\n",
        "\n",
        "import numpy as np\n",
        "import matplotlib.pylab as plt\n",
        "import tensorflow as tf\n",
        "import tensorflow_hub as hub\n",
        "import tensorflow_datasets as tfds\n",
        "\n",
        "import neural_compressor as inc\n",
        "print(\"neural_compressor version {}\".format(inc.__version__))\n",
        "\n",
        "print(\"tensorflow {}\".format(tf.__version__))\n",
        "\n",
        "from IPython import display"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "546545cc",
      "metadata": {},
      "source": [
        "## Transfer Learning"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "635c8ee5",
      "metadata": {},
      "source": [
        "### Dataset\n",
        "\n",
        "We use the publicly available [ibean](https://github.com/AI-Lab-Makerere/ibean/) dataset, which is downloaded from the internet. At about 170MB, it is small enough to download quickly and to train on in a short time.\n",
        "\n",
        "It contains leaf images of beans in 3 classes: 2 disease classes and 1 healthy class. The dataset is divided into 3 splits: train, test, and validation. This example uses only the train and test splits.\n",
        "\n",
        "Each record includes:\n",
        "1. image: shape (500, 500, 3), data type uint8\n",
        "2. label: class label (num_classes=3), data type int64\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "id": "b50c4d07",
      "metadata": {},
      "outputs": [],
      "source": [
        "# define class number\n",
        "class_num=3\n",
        "\n",
        "def load_raw_dataset():\n",
        "    raw_datasets, raw_info = tfds.load(name = 'beans', with_info = True,\n",
        "                                       as_supervised = True, \n",
        "                                       split = ['train', 'test'])\n",
        "    return raw_datasets, raw_info"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "ee042ad5",
      "metadata": {},
      "source": [
        "### Pre-Trained Model\n",
        "\n",
        "We download a pre-trained VGG19 FP32 Keras model from TensorFlow Hub.\n",
        "\n",
        "The pre-trained model takes input of shape (32, 32, 3) and outputs 10 softmax logits, one per class.\n",
        "\n",
        "We therefore resize each input image to (32, 32, 3).\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "id": "dee6b44f",
      "metadata": {},
      "outputs": [],
      "source": [
        "# define input image size (width and height)\n",
        "w=h=32"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "d8a68d54",
      "metadata": {},
      "source": [
        "### Build Model\n",
        "\n",
        "We call `hub.KerasLayer()` to download the pre-trained model and wrap it as a Keras layer.\n",
        "\n",
        "We freeze the pre-trained FP32 part of the model and add 3 `tf.keras.layers.Dense` layers and 2 `tf.keras.layers.Dropout` layers. The final `tf.keras.layers.Dense` layer has one unit per class in the dataset and uses the **softmax** activation function.\n",
        "\n",
        "During training, only the added layers are updated. Because the pre-trained layers act as a feature extractor, the model can be trained on the custom dataset in a short time."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "id": "a30fd6d8",
      "metadata": {},
      "outputs": [],
      "source": [
        "def build_model(w, h, class_num):\n",
        "    url = \"https://www.kaggle.com/models/deepmind/ganeval-cifar10-convnet/frameworks/TensorFlow1/variations/ganeval-cifar10-convnet/versions/1\"\n",
        "    feature_extractor_layer = hub.KerasLayer(url, input_shape = (w, h, 3))\n",
        "    feature_extractor_layer.trainable = False\n",
        "\n",
        "    model = tf.keras.Sequential(\n",
        "        [\n",
        "            feature_extractor_layer,\n",
        "            tf.keras.layers.Dense(256, activation = 'relu'),\n",
        "            tf.keras.layers.Dropout(0.4),\n",
        "            tf.keras.layers.Dense(256, activation = 'relu'),\n",
        "            tf.keras.layers.Dropout(0.4),\n",
        "            tf.keras.layers.Dense(class_num, activation = 'softmax')\n",
        "        ]\n",
        "    )\n",
        "\n",
        "    model.summary()\n",
        "\n",
        "    model.compile(\n",
        "        optimizer = tf.keras.optimizers.Adam(),\n",
        "        # The final Dense layer already applies softmax, so the loss receives probabilities, not logits.\n",
        "        loss = tf.keras.losses.CategoricalCrossentropy(from_logits = False),\n",
        "        metrics = ['acc']\n",
        "    )\n",
        "    return model\n",
        "\n",
        "model = build_model(w, h, class_num)"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "4446c803",
      "metadata": {},
      "source": [
        "### Data Preprocessing\n",
        "\n",
        "The pre-trained model's input shape is (32, 32, 3), so we resize each image in the dataset to that shape for transfer learning.\n",
        "\n",
        "The raw image data is uint8, so we convert it to FP32 and scale it to the range [0, 1]."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "id": "2bf7ddb6",
      "metadata": {},
      "outputs": [],
      "source": [
        "def preprocess(image, label):\n",
        "    image = tf.cast(image, tf.float32)/255.0\n",
        "    return tf.image.resize(image, [w, h]), tf.one_hot(label, class_num)"
      ]
    },
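    {
      "cell_type": "markdown",
      "id": "7d1e4a10",
      "metadata": {},
      "source": [
        "As a rough illustration of what `preprocess` does to pixel values and labels, the sketch below repeats the same arithmetic on plain Python lists. It is not part of the pipeline: `scale_pixels` and `one_hot` are hypothetical helper names, and image resizing is left out.\n",
        "\n",
        "```python\n",
        "# Illustrative sketch only: hypothetical helpers, not used by the notebook.\n",
        "\n",
        "def scale_pixels(pixels):\n",
        "    # Map uint8 intensities in [0, 255] to floats in [0.0, 1.0].\n",
        "    return [p / 255.0 for p in pixels]\n",
        "\n",
        "\n",
        "def one_hot(label, num_classes):\n",
        "    # Turn a class index into a one-hot vector.\n",
        "    return [1.0 if i == label else 0.0 for i in range(num_classes)]\n",
        "\n",
        "\n",
        "print(scale_pixels([0, 51, 255]))\n",
        "print(one_hot(2, 3))\n",
        "```\n",
        "\n",
        "The one-hot form matches the categorical cross-entropy loss used for training."
      ]
    },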
    {
      "cell_type": "markdown",
      "id": "52fed909",
      "metadata": {},
      "source": [
        "### Dataset Loader"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "id": "85c268bf",
      "metadata": {},
      "outputs": [],
      "source": [
        "def load_dataset(batch_size = 32):\n",
        "    datasets, info = load_raw_dataset()\n",
        "    return [dataset.map(preprocess).batch(batch_size) for dataset in datasets]"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "3bca7e14",
      "metadata": {},
      "source": [
        "### Training Model\n",
        "\n",
        "Train the model for 5 epochs."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "id": "7628e428",
      "metadata": {},
      "outputs": [],
      "source": [
        "def train_model(model, epochs=1):\n",
        "    train_dataset, test_dataset = load_dataset()\n",
        "    hist = model.fit(train_dataset, epochs = epochs, validation_data = test_dataset)\n",
        "    result = model.evaluate(test_dataset)\n",
        "    \n",
        "epochs=5\n",
        "train_model(model, epochs)"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "38f509f4",
      "metadata": {},
      "source": [
        "### Save Model"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "id": "9e05dab7",
      "metadata": {},
      "outputs": [],
      "source": [
        "def save_model(model, model_path):    \n",
        "    model.save(model_path)\n",
        "    print(\"Save model to {}\".format(model_path))\n",
        "    \n",
        "model_fp32_path=\"model_keras.fp32\"\n",
        "save_model(model, model_fp32_path)"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "d1075930",
      "metadata": {},
      "source": [
        "### Test Model on Single Image"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "id": "a7a24b77",
      "metadata": {},
      "outputs": [],
      "source": [
        "%matplotlib inline\n",
        "\n",
        "import matplotlib.pylab as plt\n",
        "import numpy as np\n",
        "\n",
        "\n",
        "def verify_single_image(model, test_dataset, info):\n",
        "    # Use the dataset passed in rather than the global `datasets`.\n",
        "    for sample in test_dataset.take(1):\n",
        "        [image, label] = sample\n",
        "        image_fp32, label_arr = preprocess(image, label)\n",
        "        image_fp32 = np.expand_dims(image_fp32, axis = 0)\n",
        "        pred = model(image_fp32)\n",
        "\n",
        "\n",
        "        plt.figure()\n",
        "        plt.imshow(image)\n",
        "        plt.show()\n",
        "\n",
        "        print(\"Actual Label : %s\" %info.features['label'].names[label.numpy()])\n",
        "        print(\"Predicted Label : %s\" %info.features['label'].names[np.argmax(pred)])\n",
        "        \n",
        "datasets, info = load_raw_dataset()\n",
        "verify_single_image(model, datasets[-1], info)"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "2f042e32",
      "metadata": {},
      "source": [
        "## Model Quantization using Intel\u00ae Neural Compressor (INC)"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "283be9ad",
      "metadata": {},
      "source": [
        "### Custom Dataset\n",
        "\n",
        "The custom dataset class must provide two methods: `__len__()` and `__getitem__()`.\n",
        "\n",
        "Here we use the metric function built into Intel\u00ae Neural Compressor, so the dataset must follow the format that the default metric expects: each label is a class index rather than a categorical (one-hot) vector."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "id": "b276ba91",
      "metadata": {},
      "outputs": [],
      "source": [
        "def preprocess_1(image, label):\n",
        "    image = tf.cast(image, tf.float32)/255.0\n",
        "    return tf.image.resize(image, [w, h]), label\n",
        "\n",
        "\n",
        "class Dataset(object):\n",
        "    def __init__(self):\n",
        "        datasets, info = load_raw_dataset()\n",
        "        # Materialize the preprocessed training split so it can be indexed.\n",
        "        self.train_dataset = [preprocess_1(v, l) for v, l in datasets[0]]\n",
        "\n",
        "    def __getitem__(self, index):\n",
        "        return self.train_dataset[index]\n",
        "\n",
        "    def __len__(self):\n",
        "        return len(self.train_dataset)\n"
      ]
    },
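    {
      "cell_type": "markdown",
      "id": "7d1e4a11",
      "metadata": {},
      "source": [
        "The indexing contract that `Dataset` implements can be shown without TensorFlow. The sketch below is a minimal stand-in, assuming each sample is an `(input, class_index)` pair; `ToyDataset` is a hypothetical name used only for illustration.\n",
        "\n",
        "```python\n",
        "# Illustrative sketch only: ToyDataset is hypothetical, not part of the notebook.\n",
        "\n",
        "class ToyDataset:\n",
        "    # Minimal map-style dataset: supports len() and integer indexing.\n",
        "\n",
        "    def __init__(self, samples):\n",
        "        # Each sample is an (input, class_index) pair.\n",
        "        self.samples = list(samples)\n",
        "\n",
        "    def __getitem__(self, index):\n",
        "        return self.samples[index]\n",
        "\n",
        "    def __len__(self):\n",
        "        return len(self.samples)\n",
        "\n",
        "\n",
        "toy = ToyDataset([([0.1, 0.2], 0), ([0.3, 0.4], 2)])\n",
        "features, label = toy[1]\n",
        "print(len(toy), features, label)\n",
        "```\n",
        "\n",
        "The label stays a plain class index here, which is the format the default metric expects."
      ]
    },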
    {
      "cell_type": "markdown",
      "id": "dbf5f760",
      "metadata": {},
      "source": [
        "### Quantization "
      ]
    },
    {
      "cell_type": "markdown",
      "id": "d57381b5",
      "metadata": {},
      "source": [
        "#### Quantization Plus BF16 on Sapphire Rapids (SPR) (Optional)\n",
        "\n",
        "The quantized model is accelerated when inference runs on SPR.\n",
        "\n",
        "To try quantization plus BF16 on a **non-SPR** platform, force-enable it with the following environment variables:\n",
        "\n",
        "```\n",
        "import os\n",
        "os.environ[\"FORCE_BF16\"] = \"1\"\n",
        "os.environ[\"MIX_PRECISION_TEST\"] = \"1\"\n",
        "```"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "eb526f9a",
      "metadata": {},
      "source": [
        "#### Quantization using Intel\u00ae Neural Compressor (INC) API\n",
        "\n",
        "Create a dataloader from the custom dataset defined above, then call the Intel\u00ae Neural Compressor API to quantize the FP32 model.\n",
        "\n",
        "The execution time depends on the size of the dataset and the accuracy target."
      ]
    },
    {
      "cell_type": "markdown",
      "id": "18371fad",
      "metadata": {},
      "source": [
        "#### Run Quantization on a Local SPR Server"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "id": "90fcba50",
      "metadata": {},
      "outputs": [],
      "source": [
        "from neural_compressor.data import DataLoader\n",
        "from neural_compressor.quantization import fit\n",
        "from neural_compressor.config import PostTrainingQuantConfig, AccuracyCriterion\n",
        "from neural_compressor import Metric\n",
        "\n",
        "\n",
        "def auto_tune(input_graph_path, batch_size, int8_pb_file):\n",
        "    dataset = Dataset()\n",
        "    dataloader = DataLoader(framework='tensorflow', dataset=dataset, batch_size = batch_size)\n",
        "\n",
        "    # Define the accuracy criterion and the tolerable accuracy loss\n",
        "    config = PostTrainingQuantConfig(\n",
        "        accuracy_criterion = AccuracyCriterion(\n",
        "            higher_is_better=True,\n",
        "            criterion='relative',\n",
        "            tolerable_loss=0.01\n",
        "        )\n",
        "    )\n",
        "\n",
        "    top1 = Metric(name=\"topk\", k=1)\n",
        "    \n",
        "    q_model = fit(\n",
        "        model=input_graph_path,\n",
        "        conf=config,\n",
        "        calib_dataloader=dataloader,\n",
        "        eval_dataloader=dataloader,\n",
        "        eval_metric=top1\n",
        "        )\n",
        "\n",
        "    return q_model\n",
        "\n",
        "\n",
        "\n",
        "batch_size = 32\n",
        "model_fp32_path=\"model_keras.fp32\"\n",
        "int8_pb_file = \"model_pb.int8\"\n",
        "q_model = auto_tune(model_fp32_path,  batch_size, int8_pb_file)\n",
        "q_model.save(int8_pb_file)\n",
        "print(\"Save quantized model to {}\".format(int8_pb_file))"
      ]
    },
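    {
      "cell_type": "markdown",
      "id": "7d1e4a12",
      "metadata": {},
      "source": [
        "The `relative` criterion with `tolerable_loss=0.01` can be read as a bound on the accuracy drop relative to the FP32 baseline. The sketch below spells out that reading, assuming a higher-is-better metric; `meets_relative_criterion` is a hypothetical helper, not an Intel\u00ae Neural Compressor function, and the accuracy values are made up for illustration.\n",
        "\n",
        "```python\n",
        "# Illustrative sketch only: a hypothetical helper, not an INC API.\n",
        "\n",
        "def meets_relative_criterion(fp32_acc, int8_acc, tolerable_loss=0.01):\n",
        "    # Accuracy drop of the INT8 model relative to the FP32 baseline,\n",
        "    # assuming a higher-is-better metric and a nonzero baseline.\n",
        "    relative_drop = (fp32_acc - int8_acc) / fp32_acc\n",
        "    return relative_drop <= tolerable_loss\n",
        "\n",
        "\n",
        "print(meets_relative_criterion(0.900, 0.895))  # drop of roughly 0.6%\n",
        "print(meets_relative_criterion(0.900, 0.880))  # drop of roughly 2.2%\n",
        "```\n",
        "\n",
        "Under this reading, a tighter `tolerable_loss` may require more tuning before an acceptable INT8 model is found."
      ]
    },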
    {
      "cell_type": "markdown",
      "id": "436d91ea",
      "metadata": {},
      "source": [
        "## Test the Performance & Accuracy\n",
        "\n",
        "We use the same script to test the performance and accuracy of the FP32 and INT8 models.\n",
        "\n",
        "Each test runs on 4 CPU cores.\n"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "2eb288d5",
      "metadata": {},
      "source": [
        "### Run profiling_inc.py for Inference Results on a Local SPR Server"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "eae30445-42c9-4b22-afc5-073c38a9e694",
      "metadata": {},
      "source": [
        "#### Note: We recommend using the full path to your Python environment in the commands below. Change it to match your setup."
      ]
    },
    {
      "cell_type": "markdown",
      "id": "f85989cb",
      "metadata": {},
      "source": [
        "#### Test FP32 Model"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "id": "84652afe",
      "metadata": {
        "scrolled": true
      },
      "outputs": [],
      "source": [
        "%%time\n",
        "!numactl -C 0-3 ~/.conda/envs/env_inc/bin/python profiling_inc.py --input-graph=./model_keras.fp32 --omp-num-threads=4 --num-inter-threads=1 --num-intra-threads=4 --index=32"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "c7bb2513",
      "metadata": {},
      "source": [
        "#### Test INT8 Model"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "id": "8ea84db4",
      "metadata": {
        "scrolled": true
      },
      "outputs": [],
      "source": [
        "%%time\n",
        "!numactl -C 0-3 ~/.conda/envs/env_inc/bin/python profiling_inc.py --input-graph=./model_pb.int8 --omp-num-threads=4 --num-inter-threads=1 --num-intra-threads=4 --index=8"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "27c83895",
      "metadata": {},
      "source": [
        "### Compare the Results"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "id": "1f93cfec",
      "metadata": {},
      "outputs": [],
      "source": [
        "!~/.conda/envs/env_inc/bin/python compare_perf.py"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "e3ce0f97",
      "metadata": {},
      "source": [
        "Show the results as charts."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "id": "68e47d64",
      "metadata": {},
      "outputs": [],
      "source": [
        "from IPython.display import Image, display\n",
        "\n",
        "listOfImageNames = ['fp32_int8_aboslute.png',\n",
        "                    'fp32_int8_times.png']\n",
        "\n",
        "for imageName in listOfImageNames:\n",
        "    display(Image(filename=imageName))"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "9f014f7d",
      "metadata": {},
      "source": [
        "# Citation"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "5b2dd3fb",
      "metadata": {},
      "source": [
        "```\n",
        "@ONLINE {beansdata,\n",
        "    author=\"Makerere AI Lab\",\n",
        "    title=\"Bean disease dataset\",\n",
        "    month=\"January\",\n",
        "    year=\"2020\",\n",
        "    url=\"https://github.com/AI-Lab-Makerere/ibean/\"\n",
        "}\n",
        "```"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "id": "e0fbd1da-2731-4398-a327-9908c87c8c5f",
      "metadata": {},
      "outputs": [],
      "source": [
        "!which python "
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "id": "ae14f078-3414-45c5-bb98-b77eb33c0070",
      "metadata": {},
      "outputs": [],
      "source": []
    }
  ],
  "metadata": {
    "kernelspec": {
      "display_name": "env_inc",
      "language": "python",
      "name": "env_inc"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.9.18"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 5
}